Memory usage by DOS multitasking software

An OMNIVIEW Application Note
Copyright (c) 1989 Sunny Hill Software

Rev. 1: 09/05/89


        When evaluating memory usage from the standpoint of
        multitasking software two general rules apply:

            1) An 80386 system is the best to have.
            2) Except on an 80386, any hardware EMS is better
               than an emulator and the later the EMS
               software AND HARDWARE the better.

        In order to appreciate the validity of these rules, one
        must understand the limits of the PC hardware and their
        historical basis.

    Address lines, A Primer on Electric Rocks:

        If you look at a microprocessor chip you will see that it
        is a flat rock with a lot of flat wires stuck on the
        sides: it's an electric rock. Some of the electricity is
        used to power the chip and some is used for getting data
        in and out of the rock and some is used for controlling
        other rocks.  Regardless of the "kind" of wire, each is
        (for the purposes of our discussion), always either on or
        off. Since each of the lines has only two "states" the
        microprocessor forms the basis of a "binary" computer,
        meaning one based on two values.

        Memory chips are another special kind of binary
        electrical rock.  Some of the wires on microprocessors
        are dedicated to controlling memory rocks. These wires
        are called "address lines".  In order to form an address,
        each of the address lines is first given a value which
        represents whether it is "on" or "off".  When a wire is
        "turned on" it is said to have a value of one, and when
        "turned off" a value of zero.

        If you take all the values of the address lines together
        and add them up using base 2 arithmetic you come up with
        a numerical value that represents the address (or
        location) of some kind of data.  Each memory rock is like
        a street, it has lots of places where data live. Which
        memory rock is assigned a given range of addresses
        depends on where it is physically located in a given
        computer. The number of memory rocks that a computer can
        use at one time is determined by the number of address
        lines stuck on the side of the microprocessor.
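
        As a minimal sketch of the arithmetic described above
        (written in Python, which is not part of the original
        note; the function names are illustrative only), the
        address is simply the base 2 number spelled out by the
        address lines, and the number of distinct addresses
        doubles with every line added:

```python
def address_from_lines(lines):
    """Combine address-line states into an address.

    `lines` lists the state of each line (1 = on, 0 = off),
    most significant line first.
    """
    address = 0
    for state in lines:
        address = address * 2 + state
    return address


def addressable_bytes(line_count):
    """Number of distinct addresses reachable with `line_count` lines."""
    return 2 ** line_count


print(address_from_lines([1, 0, 1]))      # 5
print(addressable_bytes(16) // 1024)      # 64   (K bytes with 16 lines)
print(addressable_bytes(20) // 1024)      # 1024 (K bytes with 20 lines)
```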

        When IBM first introduced the PC in 1981, it was based on
        the 8088 microprocessor from Intel.  This chip evolved
        from the 8080 chip that ran the popular CP/M operating
        system. One of the Big Advantages of the 8088 over the
        8080 was the one megabyte address space of the new chip:
        it had 20 different address lines instead of just 16. It
        had more reach.

        When the folks at IBM were designing the PC, they decided
        to include an expansion bus on the PC mother board.
        Since a lot of cards that IBM could foresee being plugged
        into this bus would need to have memory rocks that the
        8088 could address, these cards would have to take up
        some of the PC's address locations. IBM decided that
        address locations above the 640K boundary would be
        reserved for these plug in cards.

        Since the PC's RAM limit was still ten times what the
        8080 had to offer, many at the time considered it an
        absurdly large amount of space to run programs in; but,
        to paraphrase Murphy, applications grow to consume all
        available resources. Memory soon became tight and those
        who decried the abundance of the 640K limit soon made
        those back issues unavailable.

        The 8080 and the 8088 each have two separate sets of
        addresses that they can use. One is for Input and Output
        (I/O) and the other is for memory. The things that you
        will find living at I/O addresses are generally known as
        "peripheral devices": Things like printers, disk drives,
        displays and keyboards. I/O lines are for controlling
        hardware.

        When memory became tight on CP/M machines, some
        manufacturers implemented a scheme known as "memory
        paging". When you build a machine that uses memory paging
        you take some of the I/O lines and connect them to yet
        another kind of special electric rock: These rocks switch
        the memory address lines from one set of memory rocks to
        another based on the numbers that appear on the I/O
        lines.  If the right set of numbers is set for a given
        memory rock, the electrical signals from the CPU's memory
        address lines are sent there and it becomes "addressable"
        by the CPU.

        In a paged memory system, when you fill up one set of
        memory rocks you can send some numbers down the I/O lines
        and magically have a brand new bunch of memory to use.
        It's like turning a page in a spiral notebook: If you
        want to see what you wrote down before, just turn back
        the page.
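
        The page-turning idea can be sketched in a few lines of
        Python. This is a toy model only: the class and method
        names are invented for illustration, and a real machine
        does the switching in hardware rather than in software.

```python
class PagedMemory:
    """Toy model: several banks of memory share one window of addresses."""

    def __init__(self, bank_count, bank_size):
        self.banks = [bytearray(bank_size) for _ in range(bank_count)]
        self.selected = 0  # bank currently wired to the address lines

    def out(self, bank_number):
        """Stand-in for sending a number down the I/O lines."""
        if not 0 <= bank_number < len(self.banks):
            raise ValueError("no such bank")
        self.selected = bank_number

    def write(self, address, value):
        self.banks[self.selected][address] = value

    def read(self, address):
        return self.banks[self.selected][address]


memory = PagedMemory(bank_count=2, bank_size=16)
memory.write(0, 65)        # written on "page" 0
memory.out(1)              # turn the page
memory.write(0, 66)        # same address, different memory rock
memory.out(0)              # turn back the page
print(memory.read(0))      # 65
```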

        When memory became tight on 8088 machines, manufacturers
        again relied on memory paging to make room for more
        memory rocks.  Being manufacturers of snazzy modern
        devices they didn't want to make references to spiral
        notebooks so they renamed the trick. They called it using
        Expanded Memory.


    Types of Expanded Memory:

        "LIM" refers to a specification for constructing paged
        memory cards (and drivers for them) that was established
        by a committee representing Lotus, Intel and Microsoft;
        the numbers you often see associated with this acronym
        describe the version number of the specification. "EEMS"
        refers to another specification for the same stuff called
        the Enhanced Expanded Memory Specification which was
        developed by a committee representing AST, Quadram and
        Ashton-Tate. EEMS improved on LIM 3.2; LIM 4.0 includes
        the improvements from EEMS plus some new things of its
        own and has been accepted by AST, Quadram and
        Ashton-Tate. Regardless of the name, these specifications
        all describe some way of paging through memory.

        Another memory specification you have probably heard
        about is called the eXtended Memory Specification (XMS).
        This was developed by AST as well as the people who
        wrote the LIM specification. XMS does not deal with
        paged memory hardware but with extended memory which is,
        by definition, the memory starting at the one megabyte
        boundary. While this type of memory has its uses (loading
        TSRs, etc.), it has no direct use in multitasking.

        LIM 3.2 is the least common denominator of the three EMS
	standards. With this kind of memory HARDWARE all EMS
	pages have an address in a suburb of the mother board
	memory called the Page Frame. Here is some census data on
	the Page Frame:

            1) Each EMS page of memory is 16K bytes long.
            2) The page frame holds four of these 16K pages (a
               total of 64K bytes).
            3) The page frame is located above 640K and its
               lowest address is evenly divisible by 16K (it
               starts and ends on a page boundary).
            4) The lowest allowable address is at segment 0C000h
               (768K) and its highest starting address is at
               segment 0E000h (896K).
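
        The census data above can be checked with a short
        Python sketch. The helper names are illustrative; the
        only outside fact assumed is that a real-mode segment
        number counts 16-byte units.

```python
PAGE_SIZE_K = 16          # each EMS page is 16K bytes
FRAME_PAGES = 4           # the page frame holds four pages
PARAGRAPH = 16            # a segment number counts 16-byte units


def segment_to_k(segment):
    """Convert a real-mode segment number to an address in K bytes."""
    return segment * PARAGRAPH // 1024


def is_valid_frame_segment(segment):
    """Check a page frame starting segment against the rules listed above."""
    address = segment * PARAGRAPH
    on_page_boundary = address % (PAGE_SIZE_K * 1024) == 0
    return on_page_boundary and 0xC000 <= segment <= 0xE000


print(segment_to_k(0xC000))              # 768
print(segment_to_k(0xE000))              # 896
print(PAGE_SIZE_K * FRAME_PAGES)         # 64
print(is_valid_frame_segment(0xD000))    # True
print(is_valid_frame_segment(0xD100))    # False
```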

        It is important to remember that real EMS works by
        sending electrons down the I/O wires and causing a
        hardware switch to change which page of memory the
        microprocessor can use. The switches that control the
        accessibility of the pages of memory are called "page
        registers".  A LIM 3.2 card has four page registers and
	thus four "physical memory pages".  LIM 3.2 hardware
	provides 64K of additional, simultaneously addressable
	memory above 640K.

        It is possible to run a program in the page frame
        (OVSHELL runs there when you use XSHELL to load it there)
        but the program must be small and, in doing so, it
        violates some rules for "well behaved" programs.  Because
        of the limited size and the nature of the page frame,
        multitaskers use LIM 3.2 for swapping programs but not
        for running applications.

        Capitalizing on the limitations of LIM 3.2, the committee
        behind the EEMS standard decided to produce hardware with
        64 page registers. This made available 1M bytes of memory
        (1024K) that the EMS hardware could simultaneously keep
        track of. EEMS also did away with the limitation that EMS
        memory could only be "mapped" into the page frame: This
        meant that EMS would no longer be limited to the suburbs
        but could move uptown to the mother board.
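
        The figures quoted for each kind of hardware follow from
        the 16K page size: every page register maps one page.
        A small Python sketch of that arithmetic (the function
        name is illustrative):

```python
PAGE_SIZE_K = 16  # one page register maps one 16K page


def mappable_k(page_registers):
    """Memory that can be mapped at one time, in K bytes."""
    return page_registers * PAGE_SIZE_K


print(mappable_k(4))     # 64   (LIM 3.2 hardware)
print(mappable_k(64))    # 1024 (EEMS hardware)
```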

        The LIM 4.0 standard provides for a maximum of 255 page
        registers. It also provides for the naming of allocated
	memory pages (those that are in use) and for a hardware
	mechanism for keeping track of the EMS context known as
	Alternate Mapping Register Sets (AMRS). The EMS context
	is the combined state of the page registers (that is, the
	record of which memory rocks are in use at a given time).
	Copies of the LIM 4.0 standard are available from the
	sponsors.


    Swapping:

        OMNIVIEW treats LIM 3.2 memory in pretty much the same
        way as a RAM disk with the following exceptions:

            1) When LIM 3.2 memory fills up, OMNIVIEW will begin
               swapping programs to the drive that was current
               when OMNIVIEW was started.  You can override this
               drive selection with the SWAP environment variable
               but you cannot chain drives together to create a
               larger swapping space.
            2) You can limit the amount of EMS that a process can
               access.

        Whether using EMS or disk, when loading a swappable
        program, there are three memory requirements that must be
        fulfilled:

            1) There must be at least as much free memory
               available as is "required" by the partition.
            2) The partition must be large enough to house the
               programs that are run inside it.
            3) There must be enough swapping space to hold all
               the currently active partitions as well as the
               partition being loaded.

        The first two requirements also apply to non-swappable
        processes. If the first or third requirement is not
        met, OMNIVIEW will display a "Not enough memory" message.
        DOS will state that there is "Not enough memory to load
        program" if the second requirement is not met.
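
        The three requirements and the resulting messages can be
        summarized in a Python sketch. This is not OMNIVIEW's
        actual code: the function, its parameters and the order
        of the checks are assumptions made for illustration, and
        all sizes are in K bytes.

```python
def load_result(free_k, required_k, partition_k, program_k,
                swap_total_k=0, active_k=(), swappable=True):
    """Return the message the user would see, or "OK"."""
    # Requirement 1: enough free memory for what the partition requires.
    if free_k < required_k:
        return "Not enough memory"
    # Requirement 3 (swappable only): the swapping space must hold the
    # active partitions plus the partition being loaded.
    if swappable and swap_total_k < sum(active_k) + partition_k:
        return "Not enough memory"
    # Requirement 2: the partition must hold the program run inside it.
    # This failure is reported by DOS rather than by OMNIVIEW.
    if partition_k < program_k:
        return "Not enough memory to load program"
    return "OK"


print(load_result(512, 128, 128, 100, swap_total_k=1024, active_k=[128, 128]))
# OK
print(load_result(512, 128, 128, 100, swap_total_k=256, active_k=[128, 128]))
# Not enough memory
print(load_result(512, 128, 128, 200, swap_total_k=1024))
# Not enough memory to load program
```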


    Backfilling and concurrent processes:

        Theoretically, with EEMS or LIM 4.0 you can map memory
        into the lower 640K and you can have it start anywhere
        and be as big as you'd like (within the limit of the
        8088's address space). In reality this is not so.

        The problem with using EMS on the motherboard is one of
        the nature of the hardware and of the assumptions made
        about it. Remember that the page registers determine
	which address lines are connected to which memory rocks
	on an EMS board. Also remember that there are only a certain
	number of address lines on the CPU. It is also helpful to
	realize that electrons are indecisive, easily confused
	and dangerous when dazed.

        If you tell an EMS board to map some memory rocks into
        the lower 640K on a 640K motherboard then some
        combination of turned on address lines will point the way
        to two different memory rocks.  Electrons traveling down
        the address lines in such a situation will not know which
        way to go.  A struggle will ensue between the memory
        rocks over the favor of the electrons and the stability
        of your system will be laid to waste in the fray.

        To make full use of the expanded memory hardware on an
        EEMS board you must first REMOVE MEMORY from the
        motherboard. Once this is done the addresses on the
        motherboard will be vacant and you can safely occupy them
        with the rocks from EMS. This process of vacating and
        reinhabiting the mother board real estate is known as
        "back filling".

        What you do with the chips that you had to remove from
        the mother board depends on whether or not they will fit
        on your particular EMS board and the current market price
        of used DRAMs. On some machines, it also depends on your
        BIOS.

        When a machine is first turned on, it starts executing a
        program that is kept in a Read Only Memory (ROM) rock.
        This program is called the Power-On Self Test
        (POST). Some BIOSes are written by people who didn't
        think about using EMS memory and expected the motherboard
	to contain some minimum amount of RAM. Part of the job
	of the POST is to verify that this memory is operational
	by writing and reading back some value for each memory
	address in the presupposed range. If the POST writes to a
	memory address where no memory rock is installed, it will
	read back garbage; the memory test will fail, and the
	machine will never start up.
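
        The failure described above can be modelled with a short
        Python sketch. It is a simplification: addresses are
        stepped in K bytes to keep it short, the function name is
        invented, and the value read from an empty address (FFh
        here) is an assumption, since real hardware may return
        any garbage.

```python
def read_back_test(installed_k, expected_k):
    """Model a write/read-back test over the presupposed range.

    `installed_k` is the RAM really present on the motherboard;
    `expected_k` is the amount the BIOS presupposes.
    """
    ram = {}
    pattern = 0x55
    for address in range(expected_k):
        if address < installed_k:
            ram[address] = pattern        # a memory rock stores the value
        # with no rock installed, the write goes nowhere
        value = ram.get(address, 0xFF)    # assumption: empty address reads FFh
        if value != pattern:
            return False                  # test fails, machine will not start
    return True


print(read_back_test(installed_k=640, expected_k=512))   # True
print(read_back_test(installed_k=256, expected_k=512))   # False
```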

        Your hardware manuals should state the minimum amount of
        RAM you can have on the motherboard and still start it
        up.  If you can't discern this from reading the manual
        you will have to get that information from the people who
        sold you the machine.  If all else fails you can always
        experiment on your own, removing one (or possibly two)
	bank(s) of chips at a time until it fails to start up.
	On some machines the memory test works on banks of memory
	and you may be able to substitute smaller memory chips in
	the required banks to reduce the conventional memory.

        Once you have removed all but the minimum amount of RAM
        from the mother board, you should tell your Expanded Memory
        Manager (EMM) software about it so that it can move its
        memory into the vacant addresses. The documentation
        for your EMS card should tell you how to do this. When
        this is done, your machine should be backfilled the next
        time you start it up. You can verify that all went well
        by running CHKDSK, MAPMEM or another RAM measuring program
        and ensuring that the amount of system RAM exceeds the
        conventional memory on the motherboard.

        Once backfilling is completed then chunks of memory,
        equivalent in size to the amount of memory that was
        backfilled, can be paged in and out of the 8088 address
	space with the flick of an I/O line.  Since hardware
	memory paging is quite fast the programs which live in
	the backfilled EMS can be run in the background - as long
	as they can be switched in.

        To illustrate the implications of this last point, let's
        assume that you had an AT motherboard requiring at
        least 512K to start up. Let's further assume that the
        board was backfilled from a 1M EMS board to the 640K
        boundary with EEMS memory and that you had 512K free
        after loading OMNIVIEW. In this case you would have 384K
        of free conventional memory and 128K of free EMS memory
	addressable in the lower 640K.

        The term Transient Program Area (TPA) describes the area
	of memory available after the operating system and all
	its extensions are loaded: It is the amount of RAM you
	have to run normal (transient vs. resident) programs. In
	the example above the size of the TPA is 512K.
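
        The numbers in this example can be reproduced with a
        Python sketch. It assumes, as the example does, that the
        free memory sits at the top of the lower 640K so that the
        whole backfilled block is free; the function name is
        illustrative.

```python
CONVENTIONAL_LIMIT_K = 640


def backfill_example(motherboard_k, free_after_omniview_k):
    """Split the free memory into motherboard RAM and backfilled EMS."""
    backfilled_k = CONVENTIONAL_LIMIT_K - motherboard_k
    free_ems_k = min(backfilled_k, free_after_omniview_k)
    free_conventional_k = free_after_omniview_k - free_ems_k
    return {
        "backfilled": backfilled_k,
        "free conventional": free_conventional_k,
        "free EMS below 640K": free_ems_k,
        "TPA": free_after_omniview_k,
    }


print(backfill_example(motherboard_k=512, free_after_omniview_k=512))
# {'backfilled': 128, 'free conventional': 384,
#  'free EMS below 640K': 128, 'TPA': 512}
```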

        A partition must completely fit into the backfilled block
        of EMS in order for it to be moved around using hardware
        paging. The alternative to hardware paging is to
        physically copy every byte of the existing partition from
        the TPA into EMS and then to copy every byte of the next
        partition from EMS into the TPA.

        Physical copying would have to happen each time a new
        program is scheduled to run and, since it takes a
        relatively long time to copy partitions compared to the
        time the partitions get to run, the only thing that would
        be happening in the system is the copying of process
        data.  Obviously, it is undesirable to have the
        multitasker be the only program running in the computer.
        Only those processes that can be fit in the TPA without
        having to physically copy them there will run
	concurrently.

        In our hypothetical machine, five 128K swappable programs
        could be loaded as well as one 384K non-swappable
        process. Since each of the above processes could run
        concurrently they would all be described as "waiting" by
        the OMNIVIEW status program (OVSTAT.EXE).  Note that the
        total size of all partitions is well over 640K.

        If the larger partition had been set up as swappable it
        would have used up EMS swapping space, leaving room for
        only three other small partitions; since it would have to
        be copied in and out of the TPA it would be shown as
        "swapped" by OVSTAT if it was not already in the TPA.

        If the big partition above had been made greater than
        384K and swappable then, whenever it was in the TPA, it
        would be the only process running.  The reason for this
        is that it would be taking up part of the backfilled
        memory needed by the other processes, blocking them from
        being paged in.  If the big partition above had been made
        greater than 384K and nonswappable then it could never be
        moved out of the way of the other processes and it would
        be the only thing running as long as it was active.
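
        The cases discussed above can be collected into a Python
        sketch. It is only a restatement of the rules in this
        section for the hypothetical machine (128K backfilled,
        384K free conventional memory); the function and its
        wording are illustrative and are not OVSTAT output.

```python
def describe(size_k, swappable, backfilled_k=128, conventional_k=384):
    """Describe how a partition of `size_k` K bytes would behave."""
    # A swappable partition that fits the backfilled block is moved
    # by hardware paging and can run concurrently.
    if swappable and size_k <= backfilled_k:
        return "hardware paged, runs concurrently"
    how = "copied in and out of the TPA" if swappable else "stays resident"
    # Anything bigger than the free conventional memory spills into the
    # backfilled block and keeps the paged partitions from being mapped in.
    if size_k > conventional_k:
        return how + ", only process running while in the TPA"
    return how


print(describe(128, swappable=True))    # hardware paged, runs concurrently
print(describe(384, swappable=False))   # stays resident
print(describe(384, swappable=True))    # copied in and out of the TPA
print(describe(448, swappable=True))
# copied in and out of the TPA, only process running while in the TPA
print(describe(448, swappable=False))
# stays resident, only process running while in the TPA
```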

        For a variety of reasons, even with hardware page
        mapping, the time required to switch between processes on
        anything less than an 80386 based system precludes
        reliable process switches at a rate necessary for
        handling hardware interrupts in real time. Consequently,
        with 80286 and earlier processors, communications
        programs must be non-swappable.


    Video filling and expanding the TPA:

        Remember that the TPA is the amount of free RAM after all
        the resident programs are loaded. Also remember that the
	640K barrier was established to allow peripheral cards
	room to fit into the 8088's address space. The standard
	adapter cards with the lowest addresses are used for
	video output; we can map EMS memory between the bottom
	address of these cards and the 640K boundary since there
	is no installed memory to cause a conflict. The practice
	of mapping EMS between the 640K boundary and the bottom
	of the installed video adapter memory is known as "video
	filling" and OMNIVIEW does this automatically when
	possible.

        The amount of memory to be gained by the video fill
        depends on the video adapter installed in the system.
        The size of the TPA after video filling depends on the size
        of the video filled region and the amount of memory that was
        free before OMNIVIEW was loaded.  The table below shows
        the possibilities for each standard adapter type.

            Video        Memory          Effective
            Adapter      Gained          System Memory
            ----------------------------------------------
            EGA/VGA      0               640K
            MDA/Herc     64K             704K
            CGA          96K             736K
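
        The gain is the distance between the 640K boundary and
        the bottom of the adapter's display memory. The Python
        sketch below works this out from the usual starting
        segments for each adapter (A000h, B000h and B800h); those
        segment values are standard PC conventions assumed here,
        not figures taken from this note.

```python
CONVENTIONAL_TOP_K = 640

# Assumed start of each adapter's display memory (real-mode segments).
VIDEO_BASE_SEGMENT = {
    "EGA/VGA": 0xA000,
    "MDA/Herc": 0xB000,
    "CGA": 0xB800,
}


def video_fill(adapter):
    """Return (memory gained, effective system memory), both in K bytes."""
    base_k = VIDEO_BASE_SEGMENT[adapter] * 16 // 1024
    return base_k - CONVENTIONAL_TOP_K, base_k


for adapter in VIDEO_BASE_SEGMENT:
    print(adapter, video_fill(adapter))
# EGA/VGA (0, 640)
# MDA/Herc (64, 704)
# CGA (96, 736)
```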


    Allocating upper memory blocks using OMNIHIGH:

        Included on the distribution disk is a program called
        OMNIHIGH.COM. This program will allocate 48K of EMS
        memory in the region between the top of the system video
        adapter and the bottom of the page frame. OMNIVIEW.EXE
        is then loaded into this region, significantly reducing
        OMNIVIEW's use of the TPA.

        If OMNIHIGH issues an error message saying "Cannot
        allocate upper memory blocks" then the program could not
        gain access to the required 48K EMS memory block. This
        could be because that region of memory is already
        being used. On an 80286 or earlier system it could also be
        because your EMS software does not support the necessary
        functions or because there are not enough physical EMS
        pages to establish the memory region. On an 80386 system
        it could be that you did not tell the memory manager to
        "include" the necessary EMS region or that the region you
        stated conflicts with something else in the system.


    Extended Memory, RAM disks and EMS Emulators:

        The 80286 processor has 24 address lines providing a 16M
        byte address space. As we mentioned earlier, the upper
	15M bytes of the '286 address space are known as
	"extended memory". In order to access extended memory the
	machine must be in the "protected mode" of operation. DOS
	programs however operate in what is known as the "real
	mode" and are incompatible with the protected mode
	operation of the '286. Consequently, extended memory is
	inaccessible by DOS programs without first switching the
	microprocessor into protected mode, reading the data
	into some place in the lower 1M byte address space and
	then switching back into "real" mode.

        In order to switch from protected mode back to real mode
        the '286 must be reset; this has been likened to "turning
        off the car to shift gears". Additionally, during the
        switch to real mode, interrupts can be lost resulting in
        communications or other interrupt related errors.
        Regardless of the effectiveness of this approach it is
        what is required by a DOS program to be able to access
        extended memory and is used by VDISK and other programs.
        OMNIVIEW does not utilize extended memory directly.

        EMS emulators work in essentially the same way as VDISK
        or other extended memory RAM disk programs. The only
        difference is that VDISK wants you to think it's
        controlling a fast disk drive while the EMS emulators
        want you to think they're controlling EMS hardware. The
        process of switching to and from protected mode involves
        a fair amount of overhead by itself. Additionally, all
        the data from the partition in the TPA must be physically
        copied into the page frame and from there into extended
        memory; then the data for the partition in EMS must be
        physically copied from extended memory into the page frame
        and from there into the TPA.  Another complication with an
        EMS emulator is that to provide a full page frame it must
        take at least 64K away from the memory that would
        otherwise be in the TPA.  Setting up an extended memory
        RAM disk may be a better solution.
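
        A rough Python estimate of the copying involved in one
        process switch through an emulator is given below. It
        assumes the whole 64K page frame is used on every pass
        and ignores the cost of the mode switches themselves; the
        function name is illustrative.

```python
PAGE_FRAME_K = 64


def emulator_switch_cost(outgoing_k, incoming_k):
    """Estimate the copying for one process switch via an EMS emulator.

    Every byte passes through the page frame, so it is copied twice:
    once into the frame and once out of it.
    Returns (page frame loads, K bytes copied).
    """
    loads_out = -(-outgoing_k // PAGE_FRAME_K)   # ceiling division
    loads_in = -(-incoming_k // PAGE_FRAME_K)
    copied_k = 2 * (outgoing_k + incoming_k)
    return loads_out + loads_in, copied_k


print(emulator_switch_cost(128, 128))   # (4, 512)
```

        By comparison, hardware paging of the same two partitions
        copies nothing at all.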


    80386 based system operations:

        On an 80386 system EMS hardware capabilities can be
        provided using the virtual machine capabilities introduced
        with that chip.

        In order to utilize these features of the 80386 a Virtual
        Control Program (VCP) such as Qualitas' 386^MAX is
        required. These programs are loaded from your CONFIG.SYS
        file and eliminate the need for physically backfilling
	the motherboard. These programs also automatically
	perform video filling and provide the capability to
	allocate upper EMS blocks. In addition to all this,
        386^MAX can also load other device drivers and TSRs into
        the 640K to 1M byte address range and provides XMS
        support.

        Using 386^MAX on a 20MHz '386, OMNIVIEW can run up to ten
        programs concurrently - answering 100,000 interrupts per
	second. Total impact on the TPA will be 10-30K depending
	on your system.

        Because 386^MAX is a software product there are some
        things that you must do which would not be
        required by a hardware EMS product. You must specify the
	range of addresses to use for any upper EMS blocks and
	also specify the number of AMRSs that you wish it to use.
	Also, if you wish to exclude any of the '386 memory from
	conversion to EMS, you must tell it this as well.

        The following entry in the CONFIG.SYS file is recommended
        for 386^MAX:

        DEVICE=386MAX INCLUDE=D400-E000 AMRS=11 [others] SCREEN

	"INCLUDE" sets up the 48K EMS block at the address
	specified. This address is satisfactory for most systems.
        If OMNIHIGH complains about not being able to
        "allocate upper memory" then run 386MAX.COM with the '/E'
        option and verify that this block was "included". If
        it was then the block has been used by another program.
        If the block is not included then either you made a mistake
        typing in the command or else the specified region
        conflicts with something else in the system, probably
        with a ROM on a disk controller, network or terminal
        emulator card. You can find the location of these ROMs by
        running 386MAX.COM with the '/R' option and then change
        the include address to avoid the ROMs.  If there is no
        way to fit a 48K block into 'high DOS' then you will have
        to load OMNIVIEW into low memory.

        The "AMRS" statement allocates Alternate Mapping Register
        Sets. You should set the number of AMRSs to one
        more than the number of processes that you will want to
        run simultaneously.
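
        Before editing CONFIG.SYS, the INCLUDE range and the AMRS
        count can be sanity-checked with a Python sketch such as
        the one below. The helpers are hypothetical and are not
        part of 386^MAX; the range is assumed to be given as
        segment numbers with the end address exclusive, and the
        ROM locations used in the example are made up.

```python
def parse_range(text):
    """Turn "D400-E000" into a (start, end) pair of segment numbers."""
    start, end = text.split("-")
    return int(start, 16), int(end, 16)


def size_in_k(block):
    """Size of a segment range in K bytes (16 bytes per segment step)."""
    start, end = block
    return (end - start) * 16 // 1024


def overlaps(block, rom):
    """True if the block shares any addresses with the ROM range."""
    return block[0] < rom[1] and rom[0] < block[1]


def amrs_for(process_count):
    """One register set more than the number of concurrent processes."""
    return process_count + 1


block = parse_range("D400-E000")
print(size_in_k(block))                     # 48
print(overlaps(block, (0xC800, 0xCC00)))    # False (hypothetical ROM)
print(overlaps(block, (0xD000, 0xD800)))    # True  (hypothetical ROM)
print(amrs_for(10))                         # 11
```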

        "SCREEN" tells 386^MAX to virtualize the video hardware
	used by most programs. This allows programs that write
	directly to the screen in text and CGA graphics modes to
	operate in the background without interfering with the
	foreground program's display.

	"[others]" refers to any other arguments you
	have to supply for your system.  Consult the 386^MAX
	manual and README file for details.